Mini-Batch k-Means versus k-Means to Cluster English Tafseer Text: View of Al-Baqarah Chapter
Abstract
Al-Quran is the primary text of Muslims' religion and practice. Millions of Muslims around the world use Al-Quran as their reference guide, so knowledge can be obtained from it by Muslims and Islamic scholars in general. Al-Quran has been translated into various languages of the world, for example English, by several translators. Each translator brings his own ideas, comments, and statements to the translation of the verses, which form his interpretation (Tafseer). Therefore, this paper tries to cluster the English translation of the Tafseer using text clustering. Text clustering is a text mining method in which related documents are clustered into the same section. The study adapted two clustering algorithms (mini-batch k-means and k-means) as techniques to explain and define the link between keywords, known as features or concepts, for the Al-Baqarah chapter of 286 verses. For the dataset, data preprocessing and feature extraction using Term Frequency-Inverse Document Frequency (TF-IDF) and Principal Component Analysis (PCA) were applied. Results showed two- and three-dimensional plots assigning seven categories (k = 7) to the Tafseer. The implementation time of the mini-batch k-means algorithm (0.05485 s) outperformed that of k-means (0.23334 s). Finally, 'god', 'people', and 'believe' were the most frequent features.
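The pipeline described in the abstract (TF-IDF features, PCA projection, then k-means and mini-batch k-means with timing) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, the `verses` list holds placeholder sentences rather than the Tafseer dataset, the helper name `cluster_verses` is hypothetical, and `k = 3` is used only because the toy corpus is far too small for the paper's `k = 7`.

```python
import time

from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder sentences for illustration only; NOT the Tafseer text from the paper.
verses = [
    "God guides the people who believe",
    "People who believe follow guidance",
    "Believers pray and give charity",
    "Charity is given to the poor and the orphan",
    "The orphan and the needy deserve kindness",
    "Fasting is prescribed for the believers",
    "Those who fast remember God",
    "Debts should be written down with witnesses",
    "Witnesses must record the contract fairly",
]


def cluster_verses(texts, k, seed=0):
    """Hypothetical helper: TF-IDF -> PCA (2-D) -> both clustering algorithms."""
    # TF-IDF turns each text into a weighted term vector.
    features = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # PCA needs a dense array; two components allow a 2-D scatter plot.
    points = PCA(n_components=2, random_state=seed).fit_transform(features.toarray())

    models = {
        "k-means": KMeans(n_clusters=k, n_init=10, random_state=seed),
        "mini-batch k-means": MiniBatchKMeans(
            n_clusters=k, n_init=10, batch_size=4, random_state=seed
        ),
    }
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        labels = model.fit_predict(points)
        results[name] = (labels, time.perf_counter() - start)
    return points, results


points, results = cluster_verses(verses, k=3)
for name, (labels, seconds) in results.items():
    print(f"{name}: labels={labels.tolist()} time={seconds:.5f}s")
```

On a corpus this small the timings are not meaningful; the speed advantage reported for mini-batch k-means is expected to show up only as the number of documents grows.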
Similar resources
Nested Mini-Batch K-Means
A new algorithm is proposed which accelerates the mini-batch k-means algorithm of Sculley (2010) by using the distance bounding approach of Elkan (2003). We argue that, when incorporating distance bounds into a mini-batch algorithm, already used data should preferentially be reused. To this end we propose using nested mini-batches, whereby data in a mini-batch at iteration t is automatically re...
Turbocharging Mini-Batch K-Means
A new algorithm is proposed which accelerates the mini-batch k-means algorithm of Sculley (2010) by using the distance bounding approach of Elkan (2003). We argue that, when incorporating distance bounds into a mini-batch algorithm, already used data should preferentially be reused. To this end we propose using nested mini-batches, whereby data in a mini-batch at iteration t is automatically re...
K-means vs Mini Batch K-means: A comparison
Mini Batch K-means ([11]) has been proposed as an alternative to the K-means algorithm for clustering massive datasets. The advantage of this algorithm is that it reduces the computational cost by using a subsample of fixed size in each iteration rather than the whole dataset. This strategy reduces the number of distance computations per iteration at the cost of lower cluster quality. The purpose of this pa...
Comparing k-means clusters on parallel Persian-English corpus
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research on clustering English documents has been carried out. The question now arises: can those results be extended to other languages? Since the goal of document clustering is grouping of docum...
Faster K-Means Cluster Estimation
K-means is a widely used iterative clustering algorithm. There has been considerable work on improving k-means in terms of both mean squared error (MSE) and speed. However, most k-means variants compute the distance from each data point to each cluster centroid in every iteration. We propose two heuristics to overcome this bottleneck and speed up k-means. Our first heuristic predicts...
Journal
Journal title: Journal of Quranic Sciences and Research
Year: 2021
ISSN: 2773-5532
DOI: https://doi.org/10.30880/jqsr.2021.02.02.006